Mining Patterns from Clinical Trial Annotated Datasets by Exploiting the NCI Thesaurus

Authors

  • Joseph Benik
  • Guillermo Palma
  • Louiqa Raschid
  • Andreas Thor
  • Maria-Esther Vidal
Abstract

Annotations of clinical trials with controlled vocabularies of drugs and diseases encode scientific knowledge that can be mined to discover relationships between scientific concepts. We present PAnG (Patterns in Annotation Graphs), a tool that relies on dense subgraphs, graph summarization and taxonomic distance metrics, computed using the NCI Thesaurus, to identify patterns.


Similar resources

Improving the Accessibility of a Thesaurus-Based Catalog by Web Content Mining

In this work we focus on improving the accessibility of a catalog of radio and television productions. As productions can only be searched using the annotated meta-data, the use of a controlled vocabulary plays an important role in retrieval. However, users cannot be expected to have detailed knowledge of the terms defined within the vocabulary. In this work, we present a method t...


WordNet for Lexical Cohesion Analysis

This paper describes an approach to the analysis of lexical cohesion using WordNet. The approach automatically annotates texts with potential cohesive ties, and supports various thesaurus based and text based search facilities as well as different views on the annotated texts. The purpose is to be able to investigate large amounts of text in order to get a clearer idea to what extent semantic r...


Exploration Using Signatures in Annotation Graph Datasets

The widespread development and adoption of ontologies to capture semantic domain knowledge and the growth of annotation graph datasets have created many opportunities for large scale Linked Data analytics. Ontologies are developed by domain experts to capture knowledge specific to some domain. The biomedical community has taken the lead in these activities. Every model organism database has gene...


Towards Desiderata for an Ontology of Diseases for the Annotation of Biological Datasets

There is a plethora of disease ontologies available, all potentially useful for the annotation of biological datasets. We define seven desirable features for such ontologies and examine whether or not these features are supported by eleven disease ontologies. The four ontologies most closely aligned with our desiderata are Disease Ontology, SNOMED CT, NCI thesaurus and UMLS.


Text Mining: Extraction of Interesting Association Rule with Frequent Itemsets Mining for Korean Language from Unstructured Data

Text mining is a specific method to extract knowledge from structured and unstructured data. The knowledge extracted by the text mining process can be used for further analysis and discovery. This paper presents a method for extracting information from unstructured text data and the importance of Association Rules Mining, specifically for Korean language (text) and also, NLP (Natural Language ...




Publication date: 2012